Zhuotao Tian

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

May 08, 2025

DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception

May 07, 2025

From Mapping to Composing: A Two-Stage Framework for Zero-shot Composed Image Retrieval

Apr 25, 2025

Generalized Kullback-Leibler Divergence Loss

Mar 11, 2025

Referencing Where to Focus: Improving Visual Grounding with Referential Query

Dec 26, 2024

VisionZip: Longer is Better but Not Necessary in Vision Language Models

Dec 05, 2024

Typicalness-Aware Learning for Failure Detection

Nov 04, 2024

Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation

Jul 11, 2024

Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models

Jul 07, 2024

Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs

Jun 26, 2024